BAAC: Bangor Arabic Annotated Corpus
Similar resources
Curras: an annotated corpus for the Palestinian Arabic dialect
In this article we present Curras, the first morphologically annotated corpus of the Palestinian Arabic dialect. Palestinian Arabic is one of the many primarily spoken dialects of the Arabic language. Arabic dialects are generally under-resourced compared to Modern Standard Arabic, the primarily written and official form of Arabic. We start in the article with a background description that situ...
Automatic Creation of Arabic Named Entity Annotated Corpus Using Wikipedia
In this paper we propose a new methodology to exploit Wikipedia features and structure to automatically develop an Arabic NE annotated corpus. Each Wikipedia link is transformed into an NE type of the target article in order to produce the NE annotation. Other Wikipedia features namely redirects, anchor texts, and inter-language links are used to tag additional NEs, which appear without links i...
The Penn Arabic Treebank: Building a Large-Scale Annotated Arabic Corpus
From our three year experience of developing a large-scale corpus of annotated Arabic text, our paper will address the following: (a) review pertinent Arabic language issues as they relate to methodology choices, (b) explain our choice to use the Penn English Treebank style of guidelines, (requiring the Arabic-speaking annotators to deal with a new grammatical system) rather than doing the anno...
Transliteration of Arabizi into Arabic Orthography: Developing a Parallel Annotated Arabizi-Arabic Script SMS/Chat Corpus
This paper describes the process of creating a novel resource, a parallel Arabizi-Arabic script corpus of SMS/Chat data. The language used in social media expresses many differences from other written genres: its vocabulary is informal with intentional deviations from standard orthography such as repeated letters for emphasis; typos and nonstandard abbreviations are common; and nonlinguistic co...
Annotated Hungarian National Corpus
Zoltán Alexin (Department of Informatics, University of Szeged); Tibor Gyimóthy (Research Group on Artificial Intelligence at University of Szeged); Csaba Hatvani (Department of Informatics, University of Szeged); László Tihanyi (MorphoLogic, Budapest); János Csirik (Department of Informatics, University of Szeged) ...
Journal
Journal title: International Journal of Advanced Computer Science and Applications
Year: 2018
ISSN: 2156-5570,2158-107X
DOI: 10.14569/ijacsa.2018.091120